## This Page Is Inserted by IFW Operations and is not a part of the Official Record

## **BEST AVAILABLE IMAGES**

Defective images within this document are accurate representations of the original documents submitted by the applicant.

Defects in the images may include (but are not limited to):

- BLACK BORDERS
- TEXT CUT OFF AT TOP, BOTTOM OR SIDES
- FADED TEXT
- ILLEGIBLE TEXT
- SKEWED/SLANTED IMAGES
- COLORED PHOTOS
- BLACK OR VERY BLACK AND WHITE DARK PHOTOS
- GRAY SCALE DOCUMENTS

## IMAGES ARE BEST AVAILABLE COPY.

As rescanning documents will not correct images, please do not report the images to the Image Problem Mailbox.

## WHAT IS CLAIMED IS

|   | 1. A processor comprising:                                                                   |
|---|----------------------------------------------------------------------------------------------|
| 2 | an execution unit to execute instructions;                                                   |
| 3 | a replay system coupled to the execution unit to replay instructions which have not executed |
| 1 | properly, the replay system comprising:                                                      |
| 5 | a checker to determine whether each instruction has executed properly; and                   |
| 6 | a plurality of replay queues, each replay queue coupled to the checker to temporarily        |
| 7 | store one or more instructions for replay.                                                   |
| 1 | 2. The processor of claim 1 wherein said plurality of replay queues comprises:               |
| 2 | a first replay queue coupled to the checker to temporarily store instructions corresponding  |
| 3 | to a first thread; and                                                                       |
| 4 | a second replay queue coupled to the checker to temporarily store instructions corresponding |
| 5 | to a second thread.                                                                          |
| 1 | 3. A processor comprising:                                                                   |
| 2 | an execution unit to execute instructions;                                                   |
| 3 | a replay system coupled to the execution unit to replay instructions which have not executed |
| 4 | properly, the replay system comprising:                                                      |
| 5 | a checker to determine whether each instruction has executed properly; and                   |

| 6  | a replay queue coupled to the checker to temporarily store one or more instructions                   |
|----|-------------------------------------------------------------------------------------------------------|
| 7. | of a plurality of threads for replay, the replay queue partitioned into a plurality of sections, each |
| 8  | section provided for storing instructions of a corresponding thread.                                  |
|    |                                                                                                       |
| 1  | 4. The processor of claim 3 wherein said plurality of replay queue sections comprises:                |
| 2  | a first replay queue section coupled to the checker to temporarily store instructions                 |
| 3  | corresponding to a first thread; and                                                                  |
| 4  | a second replay queue section coupled to the checker to temporarily store instructions                |
| 1  | corresponding to a second thread.                                                                     |
| •  |                                                                                                       |
| 1  | 5. The processor of claim 3 wherein said replay system further comprises:                             |
| 2. | a replay loop to route an instruction which executed improperly to an execution unit for              |
| 3  | replay; and                                                                                           |
| 4  | a replay queue loading controller to determine whether to load an improperly executed                 |
| 5  | instruction to the replay loop or into one of the replay queue sections.                              |
| •  |                                                                                                       |
| 1  | 6. The processor of claim 3 and further comprising:                                                   |
| 2  | a scheduler to output instructions; and                                                               |
| 3  | a multiplexer or selection mechanism having a first input coupled to the scheduler, a second          |
| 4  | input coupled to the replay loop and a plurality of additional inputs, each additional input coupled  |
| 5  | to an output of one of the replay queue sections.                                                     |

| 1.  | 7. The processor of claim 3 wherein each said replay queue section comprises a replay queue           |
|-----|-------------------------------------------------------------------------------------------------------|
| 2   | section coupled to the checker to temporarily store one or more long latency instructions of a thread |
| 3   | until the long latency instruction is ready for execution.                                            |
|     |                                                                                                       |
| 1   | 8. The processor of claim 3 wherein each replay queue section comprises a thread-specific             |
| 2   | replay queue section coupled to the checker to temporarily store an instruction in which source data  |
| 3   | must be retrieved from an external memory device, the instruction being unloaded from the replay      |
| 4   | queue section when the source data for the instruction returns from the external memory device.       |
| •   |                                                                                                       |
| 1   | 9. The processor of claim 3 wherein said execution unit is a memory load unit, the processor          |
| 2 - | further comprising:                                                                                   |
| 3   | a first level cache system coupled to the memory load unit;                                           |
| 4   | a second level cache system coupled to the first level cache system; and                              |
| 5   | wherein the memory load unit performs a data request to external memory if there is a miss            |
| 6   | on both the first level and second level cache systems.                                               |
|     |                                                                                                       |
| 1   | 10. The processor of claim 9 wherein a load instruction of a thread will be loaded into a             |
| 2   | replay queue section corresponding to the thread when there is a miss on both the first level and     |
| 3   | second level cache systems for the load instruction, and the load instruction is unloaded from the    |

| 4  |   | replay queue section corresponding to the thread for re-execution when the data for the instruction     |
|----|---|---------------------------------------------------------------------------------------------------------|
| 5  |   | returns from the external memory.                                                                       |
|    |   |                                                                                                         |
| 1  |   | 11. A processor comprising:                                                                             |
| 2  | • | a multiplexer having an output;                                                                         |
| 3  |   | a scheduler coupled to a first input of the multiplexer;                                                |
| 4  |   | an execution unit coupled to an output of the multiplexer;                                              |
| 5  |   | a checker coupled to the output of the multiplexer to determine whether an instruction has              |
| 6  |   | executed properly;                                                                                      |
| 7  |   | a plurality of thread-specific replay queue sections to temporarily store instructions for each         |
| 8  |   | of a plurality of threads, an output of each of the replay queue sections coupled to additional inputs  |
| 9  |   | of the multiplexer; and                                                                                 |
| 10 |   | a controller coupled to the checker to determine when to load an instruction into one of the            |
| 11 |   | replay queue sections and to determine when to unload the replay queue sections.                        |
|    |   |                                                                                                         |
| 1  |   | 12. The processor of claim 11 and further comprising a staging section coupled between the              |
| 2  |   | checker and a further input to the multiplexer to provide a replay loop, the controller controlling the |
| 3  |   | multiplexer to select either the output of the scheduler, the replay loop or an output of one of the    |
| 4  |   | replay queue sections.                                                                                  |

| 1   | 13. The processor of claim 11 wherein the controller determines when to unload one or more              |
|-----|---------------------------------------------------------------------------------------------------------|
| 2 . | of the replay queue sections based on a data return signal.                                             |
|     |                                                                                                         |
| 1   | 14. A method of processing instructions comprising:                                                     |
| 2   | dispatching an instruction where the instruction to an execution unit and to a replay system;           |
| 3   | determining whether the instruction executed properly;                                                  |
| 4   | if the instruction did not execute properly, then:                                                      |
| 5   | determining whether the instruction should be routed back for re-execution or whether                   |
| 6   | the instruction should be temporarily stored based on a thread of the instruction.                      |
|     |                                                                                                         |
| 1   | 15. A method of processing instructions comprising:                                                     |
| 2 . | dispatching an instruction where the instruction is received by an execution unit and a replay          |
| 3   | system;                                                                                                 |
| 4   | determining whether the instruction executed properly;                                                  |
| 5   | if the instruction did not execute properly, then:                                                      |
| 6   | routing the instruction to the execution unit for re-execution if the instruction is a first            |
| 7:  | type of instruction;                                                                                    |
| 8   | otherwise, loading the instruction into one of a plurality of thread-specific replay                    |
| 9   | queue sections based on a thread of the instruction if the instruction is a second type of instruction. |

| 1   | 16. The method of claim 15 wherein the first type of instruction comprises a short latency           |
|-----|------------------------------------------------------------------------------------------------------|
| 2 - | instruction, and the second type of instruction is a longer latency instruction.                     |
| •   |                                                                                                      |
| 1   | 17. A method of processing instructions comprising:                                                  |
| 2   | initially allocating execution resources for multiple threads;                                       |
| 3   | determining that a first thread has stalled;                                                         |
| 4   | temporarily storing one or more instructions of the first thread in a queue; and                     |
| 5   | continuing to allocate execution resources to other threads which have not stalled                   |
|     |                                                                                                      |
| 1   | 18. The method of claim 17 wherein said step of continuing to allocate comprises the step            |
| 2   | of continuing to allocate execution resources to other threads which have not stalled and inhibiting |
| 3   | the allocation of further resources to the stalled thread by temporarily storing the stalled thread  |
| 4   | instructions in the queue.                                                                           |
|     |                                                                                                      |
| 1   | 19. The method of claim 17 wherein priority for execution resources are allocated to the other       |
| 2   | threads which have not stalled on a rotating priority basis.                                         |
|     |                                                                                                      |
| 1   | 20. The method of claim 17 and further comprising the steps of:                                      |
| 2   | detecting that the first thread is no longer stalled;                                                |
| 3   | unloading the one or more instructions of the first thread from the queue; and                       |
| 4   | re-allocating at least some execution resources to the first thread.                                 |

- 21. The method of claim 17 wherein the step of determining that a first thread has stalled
- 2 comprises detecting a long latency or agent instruction for the first thread.